Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today’s sequencing and array-based profiling methods present major challenges to visualization tools.
The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution.
A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data.
Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues.
IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems.
IGV is freely available for download under an open-source license (MIT for current releases; earlier releases were distributed under the GNU LGPL).
In this tutorial we will learn how to use IGV to visualize genomic data. The first step is to create a directory to store all the tutorial data. It is good practice to create a new directory for each project you work on: this ensures files do not get mixed up and all the results are self-contained.
Create a ‘tutorial’ directory to store output files:
mkdir tutorial
Download the tutorial and exercise data:
curl https://raw.githubusercontent.com/zifornd/bioinformatics-workshop/main/data/visualization/data.tar.gz --output tutorial/data.tar.gz
## % Total % Received % Xferd Average Speed Time Time Time Current
## Dload Upload Total Spent Left Speed
##
100 13.8M 100 13.8M 0 0 181k 0 0:01:18 0:01:18 --:--:-- 371k
Extract the archive file into the tutorial directory:
tar xf tutorial/data.tar.gz --directory=tutorial
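Before extracting an archive, it can help to list its contents and confirm where the files will land. The sketch below is self-contained and does not touch the tutorial data: it builds a small stand-in archive in a temporary directory (the layout and file name are hypothetical), lists it, and then extracts it with the same kind of tar call used above.

```shell
#!/usr/bin/env bash
# Build a throwaway archive so the example runs without a network connection.
# The directory layout and file name are hypothetical stand-ins.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/data/example" "$workdir/tutorial"
printf 'new\nexit\n' > "$workdir/src/data/example/script.txt"
tar czf "$workdir/tutorial/data.tar.gz" -C "$workdir/src" data

# List the archive contents without extracting anything.
listing=$(tar tzf "$workdir/tutorial/data.tar.gz")
printf '%s\n' "$listing"

# Extract into the target directory (-C is the short form of --directory).
tar xzf "$workdir/tutorial/data.tar.gz" -C "$workdir/tutorial"
extracted="$workdir/tutorial/data/example/script.txt"
```

Listing first makes it easy to spot an archive that would unpack into an unexpected location.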
The software we are going to use in this tutorial can be installed using the conda package manager. Please refer to the previous conda workshop for details on installing software and creating conda environments.
Create a new environment with IGV installed:
conda create --yes --name igvtools igv
## Collecting package metadata (current_repodata.json): ...working... done
## Solving environment: ...working... done
##
## ## Package Plan ##
##
## environment location: /opt/miniconda3/envs/igvtools
##
## added / updated specs:
## - igv
##
##
## The following NEW packages will be INSTALLED:
##
## igv bioconda/noarch::igv-2.13.2-hdfd78af_0
## libcxx conda-forge/osx-64::libcxx-14.0.6-hccf4f1f_0
## libzlib conda-forge/osx-64::libzlib-1.2.12-hfd90126_3
## openjdk conda-forge/osx-64::openjdk-17.0.3-hbc0c0cd_2
##
##
## Preparing transaction: ...working... done
## Verifying transaction: ...working... done
## Executing transaction: ...working... done
## #
## # To activate this environment, use
## #
## # $ conda activate igvtools
## #
## # To deactivate an active environment, use
## #
## # $ conda deactivate
##
## Retrieving notices: ...working... done
Activate the new environment to use it:
conda activate igvtools
Test that the igv command is available:
which igv
## /opt/miniconda3/envs/igvtools/bin/igv
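The same check can be wrapped in a small function so that a script stops early when a required command is missing. This is only a sketch: require_cmd is a hypothetical helper name, not part of conda or IGV, and the example call checks sh so that it runs on any system.

```shell
#!/usr/bin/env bash
# require_cmd is a hypothetical helper written for this example.
# It succeeds if the named command is on PATH and fails with a message otherwise.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    printf 'found: %s\n' "$1"
  else
    printf 'missing: %s (is the conda environment activated?)\n' "$1" >&2
    return 1
  fi
}

# Check a command that exists on any POSIX system.
require_cmd sh
```

In this tutorial the call would be require_cmd igv, run after activating the environment.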
The developers of IGV have produced a number of tutorial videos that describe the layout and functionality of the browser. Each video is roughly 5 minutes long and contains a lot of useful information. Rather than duplicating that material here, we suggest you watch each of the videos.
This video demonstrates how to load sequencing data:
This video demonstrates how SNPs and indels are displayed:
This video demonstrates how RNA-seq data is displayed:
This video demonstrates how variant calls are displayed:
In some cases it is useful to control IGV programmatically rather than interactively. This is handy when you want to perform many tasks in a session without having to load data and navigate the browser manually. Tasks are issued using a batch script, a text file containing commands that the browser understands. Each command appears on its own line, and the commands are run sequentially.
Below is an example of a batch script. The script creates a new session, performs some tasks, and then exits IGV:
cat tutorial/data/example/script.txt
## new
## genome hg19
## goto chr12:7,939,997-7,953,742
## snapshot tutorial/data/example/snapshot.png
## exit
The script is run by passing it to the IGV launcher with the --batch option:
igv --batch tutorial/data/example/script.txt
## Using system JDK.
## WARNING: package com.sun.java.swing.plaf.windows not in java.desktop
## WARNING: package sun.awt.windows not in java.desktop
## openjdk version "17.0.3" 2022-04-19 LTS
## OpenJDK Runtime Environment Zulu17.34+19-CA (build 17.0.3+7-LTS)
## OpenJDK 64-Bit Server VM Zulu17.34+19-CA (build 17.0.3+7-LTS, mixed mode, sharing)
## INFO [Sept 24,2022 09:55] [Main] Startup IGV Version user not_set
## INFO [Sept 24,2022 09:55] [Main] Java 17.0.3 (build 17.0.3+7-LTS) 2022-04-19
## INFO [Sept 24,2022 09:55] [Main] Java Vendor: Azul Systems, Inc. http://www.azul.com/
## INFO [Sept 24,2022 09:55] [Main] JVM: OpenJDK 64-Bit Server VM Zulu17.34+19-CA
## INFO [Sept 24,2022 09:55] [Main] OS: Mac OS X 12.6 x86_64
## INFO [Sept 24,2022 09:55] [Main] IGV Directory: /Users/James/igv
## SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
## SLF4J: Defaulting to no-operation (NOP) logger implementation
## SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
## INFO [Sept 24,2022 09:55] [AmazonUtils] AWS default credentials found. AWS support enabled.
## INFO [Sept 24,2022 09:55] [CommandListener] Listening on port 60151
## INFO [Sept 24,2022 09:55] [BatchRunner] Executing batch script: tutorial/data/example/script.txt
## INFO [Sept 24,2022 09:55] [GenomeManager] Loading genome: https://s3.amazonaws.com/igv.org.genomes/hg19/hg19.json
## INFO [Sept 24,2022 09:56] [TrackLoader] Loading resource: https://s3.amazonaws.com/igv.org.genomes/hg19/ncbiRefSeq.sorted.txt.gz
## INFO [Sept 24,2022 09:56] [ShutdownThread] Shutting down
The browser outputs a picture of the NANOG locus in the human genome:
This example is fairly simple, but more complicated tasks can be achieved with additional commands. The full list of batch commands is displayed below.
| Command | Parameters | Description |
|---|---|---|
| collapse | trackName | Collapses a given track. trackName is optional; if it is not supplied, all tracks are collapsed. |
| colorBy | option tagName | Sets the color by option for alignment tracks. For option TAG also specify the tag name. |
| echo | | Writes echo back to the response, primarily for testing port connections. |
| exit | | Exit (close) the IGV application. |
| expand | trackName | Expands a given track. trackName is optional; if it is not supplied, all tracks are expanded. |
| genome | genomeIdOrPath | Selects a genome by id, or loads a genome (or indexed fasta) from the supplied path. |
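| goto | locus or listOfLoci | Scrolls to a single locus or a space-delimited list of loci. If a list is provided, the loci are displayed in a split-screen view. |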
| group | option tagName | Alignment tracks only. Group alignments by the specified option. See below for valid option values. For option TAG also specify the tag name. |
| load | file | Loads a data or session file by specifying a full path to a local file or a URL. To explicitly specify a path to an index file, use the optional index= parameter. For example: load foo.bam index=bar.bai |
| maxPanelHeight | height | Sets the number of vertical pixels (height) of each panel to include in an image. Images created from a port command or batch script are not limited to the data visible on the screen; they can include the entire panel, not just the portion visible in the scrollable screen area. The default value is 1000; increase it to see more data, or decrease it to create smaller images. To capture the exact area visible on the screen, set this value to -1. |
| new | | Create a new session. Unloads all tracks except the default genome annotations. |
| preference | key value | Temporarily sets the preference named key to the specified value. The setting lasts only until IGV is shut down. The complete set of preference keys is listed in the file preferences.tab: the first column is the key and the third column is the value type. For some value types the permitted values follow as a list delimited by the character \|. |
| region | chr start end | Defines a region of interest bounded by the two loci (e.g., region chr1 100 200). |
| saveSession | filename | Save the current session. It is recommended that a full path be used for filename. IGV release 2.11.1 |
| setAltColor | colorString trackName | Set the track altColor, used for negative values in a wig track or negative strand features. See description of setColor below. IGV release 2.11.1 |
| setColor | colorString trackName | Set the track color. colorString can be a comma delimited rgb string with components in the range 0-255, for example 255,0,0, or a hex color string, for example FF0000. IGV release 2.11.1 |
| setDataRange | rangeString trackName | Sets the data range (scale) for all numeric tracks or, if a trackName is specified, for that track only. rangeString is a comma-delimited pair of values, min,max. As of release 2.11.0, ‘auto’ can be used for rangeString, which sets the track(s) to autoscale. |
| setLogScale | true/false trackName | Sets the data scale to log (true) or linear (false). Optionally specify a track; if no track is specified, all numeric tracks are set. |
| setSleepInterval | ms | Sets a delay (sleep) time in milliseconds. The sleep interval is invoked between successive commands. |
| setTrackHeight | height trackName | Set the specified track’s height in integer units. trackName is required. |
| snapshotDirectory | path | Sets the directory in which to write images. |
| snapshot | filename | Saves a snapshot of the IGV window to an image file. If filename is omitted, writes a PNG file with a filename generated based on the locus. If filename is specified, the filename extension determines the image file format, which must be either .png or .svg. |
| sort | option locus | Sorts alignment or segmented copy number tracks. See below for valid option values. If supplied, the locus option can define a single position or a range. If absent, sorting is based on the region in view for segmented copy number tracks, or on the center position of the region in view for alignment tracks. |
| squish | trackName | Squishes a given track. trackName is optional; if it is not supplied, all annotation tracks are squished. |
| viewaspairs | trackName | Set the display mode for an alignment track to View as pairs. trackName is optional. |
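As a sketch of how several of these commands combine, the shell function below generates a batch script that takes one snapshot per locus. The function name make_batch_script, the output directory name and the second locus are hypothetical choices made for this example; only commands listed in the table above are emitted.

```shell
#!/usr/bin/env bash
# make_batch_script is a hypothetical helper written for this example.
# Usage: make_batch_script GENOME SNAPSHOT_DIR LOCUS [LOCUS ...]
# It prints an IGV batch script that takes one snapshot per locus.
make_batch_script() {
  mbs_genome=$1
  mbs_outdir=$2
  shift 2
  printf 'new\n'
  printf 'genome %s\n' "$mbs_genome"
  printf 'snapshotDirectory %s\n' "$mbs_outdir"
  for mbs_locus in "$@"; do
    printf 'goto %s\n' "$mbs_locus"
    # Name each image after its locus, replacing ':' to keep the file name portable.
    mbs_name=$(printf '%s' "$mbs_locus" | tr ':' '_')
    printf 'snapshot %s.png\n' "$mbs_name"
  done
  printf 'exit\n'
}

# Write a script covering two loci; the second locus is an arbitrary example.
script=$(mktemp)
make_batch_script hg19 snapshots \
  chr12:7939997-7953742 \
  chr19:7261376-7262037 > "$script"
cat "$script"
```

The generated file could then be passed to the IGV launcher in the same way as the example script earlier in this section.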
The exercises below are designed to strengthen your knowledge of using IGV and writing batch commands. The solution to each problem is blurred; look at the solution only after attempting to solve the problem yourself. Should you need any help, please ask one of the instructors.
ChIP-sequencing, also known as ChIP-seq, is a method used to analyze protein interactions with DNA. ChIP-seq combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. It can be used to map global binding sites precisely for any protein of interest.
In the chipseq directory are a set of files generated from the analysis of a ChIP-seq experiment:
All of the data files are compatible with the mm10 mouse reference genome.
Using both the IGV browser and the command line, answer the following questions:
Hint: see man sort for help sorting.
# ZHBTC4_OCT4_UNT_peak_24
sort -k5,5nr tutorial/data/chipseq/ZHBTC4_OCT4_UNT_peaks.narrowPeak | head -n 1
## chr19 7261376 7262037 ZHBTC4_OCT4_UNT_peak_24 1690 . 36.5612 176.877 169.089 326
# Rcor2
Hint: see man awk for help calculating.
# ZHBTC4_OCT4_UNT_peak_149
awk '{print $1, $2, $3, $4, $3 - $2}' tutorial/data/chipseq/ZHBTC4_OCT4_UNT_peaks.narrowPeak | sort -k5,5nr | head -n 1
## chr19 38347912 38349082 ZHBTC4_OCT4_UNT_peak_149 1170
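The sort and awk steps used in these solutions can also be combined to turn a peak record into a locus string that can be pasted into the IGV search box or used with goto. The sketch below runs on two toy narrowPeak records written to a temporary file (the peak names and most values are made up for illustration), so it does not depend on the tutorial data.

```shell
#!/usr/bin/env bash
# Toy narrowPeak records (tab-separated); names and most values are made up.
peaks=$(mktemp)
printf 'chr19\t7261376\t7262037\tpeak_a\t1690\t.\t36.5\t176.8\t169.0\t326\n' >  "$peaks"
printf 'chr19\t38347912\t38349082\tpeak_b\t900\t.\t20.1\t95.2\t90.3\t500\n'  >> "$peaks"

# Sort by score (column 5, numeric, descending), keep the best record, and
# print it as chr:start-end. narrowPeak starts are 0-based while IGV loci are
# 1-based, so 1 is added to the start coordinate.
top_locus=$(sort -k5,5nr "$peaks" | head -n 1 |
  awk '{printf "%s:%d-%d\n", $1, $2 + 1, $3}')
echo "$top_locus"
```

For the toy input this prints chr19:7261377-7262037.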
# G
# 3
Exome sequencing, also known as whole exome sequencing (WES), is a genomic technique for sequencing all of the protein-coding regions of genes in a genome (known as the exome). It consists of two steps: the first step is to select only the subset of DNA that encodes proteins. These regions are known as exons. The second step is to sequence the exonic DNA using any high-throughput DNA sequencing technology.
In the exome directory is a VCF file generated from the analysis of a whole-exome sequencing (WES) experiment. The file is called 1KGP.vcf and contains variant calling information from a number of samples from the 1000 Genomes Project data portal. The VCF file is compatible with the hg38 human reference genome.
Load the VCF file in the IGV browser and answer the following questions:
# 10
# Reference: A
# Alternate: G
# 1
# The first SNP is homozygous reference
# chr19:9387798
# chr19:55999276
# Homozygous reference: 1
# Homozygous variant: 3
# Heterozygous variant: 6
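Genotype counts like these can be cross-checked on the command line. The following is a minimal sketch, assuming biallelic sites and that GT is the first entry of each sample field (as the VCF specification requires when GT is present). It runs on a toy VCF with four made-up samples rather than on 1KGP.vcf.

```shell
#!/usr/bin/env bash
# Toy VCF with one biallelic SNP and four hypothetical samples (S1..S4).
vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4\n' >> "$vcf"
printf 'chr19\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0|0\t0|1\t1|1\t1|0\n' >> "$vcf"

# Count genotype classes per variant. Sample columns start at column 10.
genotype_counts=$(awk -F'\t' '
  /^#/ { next }                      # skip header lines
  {
    ref = het = alt = 0
    for (i = 10; i <= NF; i++) {
      split($i, field, ":")          # GT is the first colon-separated entry
      gt = field[1]
      gsub(/[|]/, "/", gt)           # treat phased and unphased calls alike
      if (gt == "0/0") ref++
      else if (gt == "1/1") alt++
      else if (gt == "0/1" || gt == "1/0") het++
    }
    printf "%s:%s hom_ref=%d het=%d hom_alt=%d\n", $1, $2, ref, het, alt
  }' "$vcf")
echo "$genotype_counts"
```

For the toy input this prints chr19:100 hom_ref=1 het=2 hom_alt=1. Missing or multi-allelic genotypes are simply not counted in this sketch.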
RNA-seq (short for RNA sequencing) is a technique that uses next-generation sequencing (NGS) to reveal the presence and quantity of RNA in a biological sample at a given moment, capturing the continuously changing cellular transcriptome.
In the rnaseq directory are a set of coverage tracks generated from the analysis of an RNA-seq experiment. All of the data files are compatible with the mm10 mouse reference genome. Write a batch script to recreate the snapshot shown below:
Click the bash chunk to reveal the solution:
new
genome mm10
goto chr19:8,987,120-9,078,935 chr19:38,052,980-38,076,424 chr19:27,386,697-27,431,908
load tutorial/data/rnaseq/BRG1FL_TAM.bw
setColor #FF0000 BRG1FL_TAM.bw
load tutorial/data/rnaseq/BRG1FL_UNT.bw
setColor #0000FF BRG1FL_UNT.bw
load tutorial/data/rnaseq/ZHBTC4_DOX.bw
setColor #008000 ZHBTC4_DOX.bw
load tutorial/data/rnaseq/ZHBTC4_UNT.bw
setColor #800080 ZHBTC4_UNT.bw
setDataRange 0,50
snapshot exercises/expression/snapshot.png
exit